AMENDMENTS TO THE CLAIMS:

This listing of claims will replace all prior versions, and listings, of claims in the 
application: 

LISTING OF CLAIMS: 

1. (Currently Amended) A method of displaying files within a file system
to a user in a semantic hierarchy, the method comprising the steps of: 

mapping the files in the file system into a semantic vector space; 

clustering the files within said space, wherein multiple threshold values that 
are settable to desired levels of granularity are defined, and said files are clustered 
based on said multiple threshold values; 

deriving a hierarchy of plural levels of clusters from said clustering; and 

providing a user an option of displaying the files in a hierarchical format of
plural levels of clusters based on said derived hierarchy, or displaying the files in a
hierarchical format based on locations of the files in the file system.
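
Illustrative note (not part of the claims): one way to read the clustering and deriving steps of claim 1 is that each threshold value yields one level of clusters, and the levels nest to form a hierarchy. The sketch below assumes the files have already been mapped to vectors; the 2-D coordinates, the file names, the single-link grouping rule, and the functions `threshold_clusters` and `build_hierarchy` are all hypothetical choices made for illustration, not the applicant's method.

```python
from math import dist

def threshold_clusters(vectors, threshold):
    """Single-link grouping: files within `threshold` of each other share a cluster."""
    names = sorted(vectors)
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links to the cluster representative (with path halving).
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if dist(vectors[a], vectors[b]) <= threshold:
                parent[find(b)] = find(a)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(groups.values())

def build_hierarchy(vectors, thresholds):
    """One level of clusters per threshold, coarsest level first."""
    return [threshold_clusters(vectors, t) for t in sorted(thresholds, reverse=True)]

# Hypothetical file vectors standing in for positions in a semantic vector space.
files = {
    "budget.txt": (0.0, 0.0),
    "invoice.txt": (0.0, 1.0),
    "recipe.txt": (5.0, 5.0),
    "menu.txt": (5.0, 6.5),
}
levels = build_hierarchy(files, thresholds=[1.2, 2.0, 10.0])
for depth, level in enumerate(levels):
    print(depth, level)
```

With single-link grouping, lowering the threshold can only split clusters, never merge them, so each finer level nests inside the coarser one above it. Other clustering rules may not have this property.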

2. (Original) The method according to claim 1, wherein the step of
clustering the files is performed as a background routine during the operation of a 
computer associated with said file system. 

3. (Original) The method according to claim 2, wherein the step of 
clustering the files is performed in response to the creation of a new file within the file 
system.

4. (Original) The method according to claim 1, wherein said files are text
documents and said mapping is conducted on the basis of a language model. 

5. (Original) The method according to claim 4, wherein said mapping step 
comprises the steps of constructing a matrix which associates each word in the 
documents with a vector and associates each document with a vector. 

6. (Original) The method of claim 5, further including the step of 
decomposing said matrix to define the words and documents as vectors in a 
continuous vector space. 

7. (Original) The method of claim 5, wherein said clustering is performed 
by identifying documents whose vectors are within a threshold distance of one 
another. 
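
Illustrative note (not part of the claims): the matrix of claim 5 and the threshold test of claim 7 can be sketched as follows. The sketch assumes raw word counts as matrix entries and cosine distance as the distance measure, neither of which is specified in the claims, and it omits the decomposition step of claim 6. The function names and sample documents are hypothetical.

```python
from collections import Counter
from math import sqrt

def word_document_matrix(docs):
    """Rows are words, columns are documents; each cell is a raw word count."""
    names = sorted(docs)
    counts = {n: Counter(docs[n].lower().split()) for n in names}
    vocab = sorted(set().union(*counts.values()))
    matrix = [[counts[n][w] for n in names] for w in vocab]
    return vocab, names, matrix

def cosine_distance(u, v):
    """1 - cosine similarity; treated as 1.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm else 1.0

def close_pairs(docs, threshold):
    """Return document pairs whose column vectors lie within `threshold`."""
    vocab, names, matrix = word_document_matrix(docs)
    columns = {n: [row[j] for row in matrix] for j, n in enumerate(names)}
    return [(a, b) for i, a in enumerate(names) for b in names[i + 1:]
            if cosine_distance(columns[a], columns[b]) <= threshold]

# Hypothetical sample documents.
docs = {
    "a.txt": "stock market prices rise",
    "b.txt": "stock prices fall",
    "c.txt": "bake the bread",
}
print(close_pairs(docs, threshold=0.5))
```

Here each row of the matrix is the vector associated with a word and each column is the vector associated with a document, matching the two associations recited in claim 5.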

8. (Canceled) 

9. (Previously Presented) The method of claim 5 further including the step 
of automatically labeling the clusters based on the resulting clusters. 

10. (Original) The method of claim 9 wherein said labeling comprises 
selecting representative words based on the closeness of their vectors to the 
document vectors in a cluster. 
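
Illustrative note (not part of the claims): the labeling of claim 10 can be sketched by ranking words according to how close their vectors lie to the documents of a cluster. The sketch assumes that word and document vectors already share one vector space, summarizes the cluster by its centroid, and measures closeness by cosine similarity. These are assumptions made for illustration, and the vectors, names, and `label_cluster` function are hypothetical.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity; 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def label_cluster(word_vectors, doc_vectors, cluster, top_n=2):
    """Pick the words whose vectors lie closest to the cluster's centroid."""
    members = [doc_vectors[name] for name in cluster]
    centroid = [sum(axis) / len(members) for axis in zip(*members)]
    # Sort by descending similarity; ties are broken alphabetically.
    ranked = sorted(word_vectors,
                    key=lambda w: (-cosine(word_vectors[w], centroid), w))
    return ranked[:top_n]

# Hypothetical vectors in a shared two-dimensional space.
word_vectors = {"finance": (1.0, 0.1), "tax": (0.9, 0.3), "cooking": (0.1, 1.0)}
doc_vectors = {"budget.txt": (1.0, 0.0), "invoice.txt": (0.8, 0.2)}
labels = label_cluster(word_vectors, doc_vectors, ["budget.txt", "invoice.txt"])
print(labels)
```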

11. (Currently Amended) A computer-readable medium containing a
graphical user interface configured to display files in a virtual file system with a
semantic hierarchy of plural levels of clusters that is derived from semantic
similarities of said files, clustering said files based on multiple threshold values that
are settable to desired levels of granularity, and determining a directory structure
having plural levels of clusters based on the clustering determined from similarities
between said files, wherein the graphical user interface provides a user an option of
graphically displaying the determined directory structure having plural
levels of clusters to be displayed on a display device, or displaying the files in a
hierarchical format based on locations of the files in the virtual file system.

12. (Canceled) 

13. (Previously Presented) The graphical user interface according to claim 
11, wherein clustering of the files is initiated by user selection.

14. (Previously Presented) The graphical user interface according to claim 
11, wherein clustering of the files is initiated upon creation of a new file in the file
system. 

15. (Previously Presented) The graphical user interface according to claim 
11, wherein text files are clustered utilizing a language model and non-text files are
clustered utilizing rule-based techniques.

16. (Original) The graphical user interface according to claim 15, wherein
said language model comprises the LSA paradigm. 

17. (Currently Amended) Computer readable media having stored therein 
computer executable code for analyzing files in a file system to determine similarities 
in data pertaining to their content, clustering said files based on multiple threshold 
values that are settable to desired levels of granularity, determining a directory 
structure having plural levels of clusters based on the clustering determined from 
similarities between the files, and providing a user an option of displaying files in 
hierarchical format of plural levels of clusters based on the clustering determined 
from similarities between the files, or displaying the files in a hierarchical format
based on locations of the files in the file system.

18. (Original) The computer-readable media of claim 17 wherein said files
are text documents, and the similarities are based upon the word content of the files. 

19. (Original) The computer-readable media of claim 18 wherein said
similarities are determined in accordance with a language model, and the files are 
clustered in accordance with said model. 

20. (Original) The computer-readable media of claim 19, wherein said 
language model comprises the LSA paradigm.

21. (Previously Presented) The computer-readable media of claim 19,
wherein said computer-executable code performs the steps of constructing a matrix 
which associates each word in the documents with a vector and associates each 
document with a vector. 

22. (Original) The computer-readable media of claim 21, wherein said
computer-executable code further performs step of decomposing said matrix to 
define the words and documents as vectors in a continuous vector space. 

23. (Original) The computer-readable media of claim 22, wherein said 
computer-executable code performs clustering by identifying documents whose 
vectors are within a threshold distance of one another. 

24. (Canceled) 

25. (Previously Presented) The computer-readable media of claim 19, 
wherein said computer-executable code performs step of automatically labeling the 
clusters based on the resulting clusters. 

26. (Original) The computer-readable media of claim 25, wherein said 
labeling comprises selecting representative words based on the closeness of their 
vectors to the document vectors in a cluster.

27. (Previously Presented) The computer readable media according to
claim 17, wherein the computer executable code performs the following steps: 

clustering text files within the file system using semantic similarities;

clustering non-text files within the files system using rule-based techniques;

labeling the resulting clusters; and

displaying the files in a hierarchical format based on the resulting clusters and
labels.

28. (Currently Amended) A computer system, comprising:

a file system storing files;

a display device; 

a processor for analyzing the content of files stored in said file system to map 
said files into a semantic vector space, cluster the files within said space based on 
multiple threshold values that are settable to desired levels of granularity, and derive 
a hierarchy of plural levels of clusters from said clustering; and 

a user interface which provides a user an option of displaying
representations of files stored in said file system in the form of said derived hierarchy
of plural levels of clusters, or displaying the files in a hierarchical format based on
locations of the files in the file system.

29. (Canceled)

30. (Previously Presented) The computer system of claim 28, wherein said
files are text documents and said processor maps said files on the basis of a 
language model. 

31. (Original) The computer system of claim 30 wherein said processor
constructs a matrix which associates each word in the documents with a vector and 
associates each document with a vector. 

32. (Original) The computer system of claim 31 wherein said processor 
further decomposes said matrix to define the words and documents as vectors in a 
continuous vector space. 

33. (Original) The computer system of claim 31, wherein said processor
clusters the files by identifying documents whose vectors are within a threshold 
distance of one another. 

34. (Canceled)

35. (Previously Presented) The computer system of claim 31, wherein said
processor automatically labels the clusters based on the resulting clusters. 

36. (Original) The computer system of claim 35 wherein said processor 
labels the clusters by selecting representative words based on the closeness of their 
vectors to the document vectors in a cluster.

37. (Previously Presented) The method according to claim 1, wherein said
deriving step includes organizing the clusters into a hierarchical directory structure. 

38. (Currently Amended) A method of organizing a plurality of documents 
in a file system, comprising:

mapping all words of the plurality of documents in the file system and the 
plurality of documents in a semantic vector space; 

generating a plurality of clusters based on the semantic similarities of the 
plurality of documents and multiple threshold values that are settable to desired 
levels of granularity; 

organizing the plurality of clusters into directories in a hierarchical format of 
plural levels of clusters; and 

providing a user an option of displaying the plurality of documents in said 
hierarchical format of plural levels of clusters based on a result of clustering the 
plurality of documents, or displaying the documents in a hierarchical format based on
locations of the documents in the file system.

39. - 47. (Canceled) 

48. (Previously Presented) The method of claim 1, wherein the multiple
threshold values are characteristic values of clusters from said clustering.

Attorney Docket No. P2989-908
Application No. 10/644,815

49. (Previously Presented) The method of claim 48, wherein the 
characteristic values of the clusters are cluster variances of the clusters. 

50. (Previously Presented) The graphical user interface according to 
claim 11, wherein the multiple threshold values are characteristic values of clusters
from said clustering. 

51. (Previously Presented) The graphical user interface according to claim
50, wherein the characteristic values of the clusters are cluster variances of the 
clusters. 

52. (Previously Presented) The computer-readable media of claim 17, 
wherein the multiple threshold values are characteristic values of clusters from said 
clustering. 

53. (Previously Presented) The computer-readable media of claim 52, 
wherein the characteristic values of the clusters are cluster variances of the clusters. 

54. (Previously Presented) The computer system of claim 28, wherein 
the multiple threshold values are characteristic values of clusters from said 
clustering. 

55. (Previously Presented) The computer system of claim 54, wherein the 
characteristic values of the clusters are cluster variances of the clusters.

56. (Previously Presented) The method of claim 38, wherein the
multiple threshold values are characteristic values of clusters from said clustering. 

57. (Previously Presented) The computer of claim 56, wherein the 
characteristic values of the clusters are cluster variances of the clusters. 

58. (Previously Presented) The method of claim 1, further comprising
providing a user an option to reorganize the files in the file system according to the 
derived hierarchy. 



